
Setting up the workspace

The first step is to load all the packages we need. We are going to be using ‘nlme’, ‘lme4’, ‘lmerTest’, ‘optimx’, ‘boot’, the ‘sjPlot’ family, ‘brms’, and the ‘tidyverse’.

Notes on the packages

# Fits multilevel models
library("nlme")
# Also fits multilevel models
library("lme4")
# Adds p-values to lme4
library("lmerTest")
# Used for algorithms
library('optimx')
# bootstrapping
library("boot")
# Useful for plotting
library("sjPlot")
library("sjmisc")
library("sjlabelled")
# Canned Rstan models (needed for the multiple membership example)
library("brms")
# tidyverse
library("tidyverse")

Next we set the working directory. You need to specify the folder path from YOUR computer; specifically, use the folder where the data are stored.

# Enter the folder path on your computer
setwd('C:/enter/the/folderpath/on/your/computer')

1. Basic Linear model

We are going to be using a longitudinal example. After reading in the data, we can print the dimensions of the dataset to see if it has the expected number of rows and columns.

# Reading in the data
chapman <- readRDS("longitudinal_chapman.rds")
# Checking the dataset
dim(chapman)
## [1] 140  11

Next we print a small subset of rows just so we can look at the data.

id score week weeksq Week1 Week2 Week3 Week4  iq     iq_c      iq_c2
 5   190    0      0     1     0     0     0  81 -32.4571 1053.46334
 5   220    1      1     0     1     0     0  81 -32.4571 1053.46334
 5   229    2      4     0     0     1     0  81 -32.4571 1053.46334
 5   220    3      9     0     0     0     1  81 -32.4571 1053.46334
19   167    0      0     1     0     0     0 120   6.5429   42.80954
19   201    1      1     0     1     0     0 120   6.5429   42.80954
19   233    2      4     0     0     1     0 120   6.5429   42.80954
19   216    3      9     0     0     0     1 120   6.5429   42.80954

The important variables here are: ‘id’ (the person identifier), ‘score’ (the outcome), and ‘week’ (the measurement occasion, 0 through 3). The ‘weeksq’ and ‘Week1’ through ‘Week4’ columns are the squared and dummy-coded versions of ‘week’, and ‘iq_c’ and ‘iq_c2’ are the centered and centered-squared versions of ‘iq’.

Plot Score vs. Time (by individuals)

Let’s briefly look at the data visually. I have randomly sampled 10 individuals and plotted their scores over time. It looks like there is evidence of both random intercepts and random slopes. Note that ‘facet_wrap()’ is a useful feature of ggplot2, allowing us to make a separate panel for each individual.

# Randomly sample 10 id's
set.seed(111111111)
rand <- sample(unique(chapman$id), 10)

# Plot those 10 id's individually over time. 
ggplot(data = chapman[chapman$id %in% rand,], aes(x=week, y = score)) +
  geom_point() +
  ylim(50, 300)+
  geom_smooth(method='lm',formula=y~x) +
  facet_wrap("id")

Fitting models in a frequentist framework

We are going to use the ‘lmer()’ function from the ‘lme4’ package. First I will fit a random-effects ANOVA model, which is often used to get the ICC.

\[score_{ij} = \gamma_{00} + u_{0j} + r_{ij}\]

Note that I am calling the ‘lmer()’ function as ‘lme4::lmer()’. Both ‘lme4’ and ‘lmerTest’ use the ‘lmer()’ function. We have both packages loaded and R will default to using the function from the last loaded package. One way to make sure R doesn’t get confused is to add the package name before the function, as I am doing below.
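If you ever want to check which package a bare call would resolve to, you can ask R for the function’s namespace (a quick sketch):

```r
# Shows the namespace that owns the masking version of lmer()
environment(lmer)   # here lmerTest, since it was loaded after lme4
```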

# Fitting the model
fit_m0 <- lme4::lmer(score ~ 1 + (1 | id),
                     data      = chapman,
                     REML      = FALSE,
                     na.action = na.omit)

# Printing the model summary
summary(fit_m0)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ 1 + (1 | id)
##    Data: chapman
## 
##      AIC      BIC   logLik deviance df.resid 
##   1466.1   1475.0   -730.1   1460.1      137 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3435 -0.6309 -0.0363  0.6178  1.9413 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  id       (Intercept)  574.3   23.96   
##  Residual             1583.7   39.80   
## Number of obs: 140, groups:  id, 35
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  204.814      5.265    38.9
# Computing the ICC from the estimates above
tau2      <- 574.3
sigma2    <- 1583.7
ICC       <- tau2/(sigma2 +tau2)
paste("The ICC is:", round(ICC, 3))
## [1] "The ICC is: 0.266"
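Hand-copying the variance estimates invites transcription errors. As a sketch, we could instead extract the components programmatically with ‘VarCorr()’:

```r
# Pull the variance components straight from the fitted model
vc     <- as.data.frame(VarCorr(fit_m0))   # columns: grp, var1, var2, vcov, sdcor
tau2   <- vc$vcov[vc$grp == "id"]
sigma2 <- vc$vcov[vc$grp == "Residual"]
tau2 / (tau2 + sigma2)
```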

Let’s fit a few more models and see whether there is an effect of week. This model would look like:

\[score_{ij} = \gamma_{00} + \gamma_{01}Week_{ij} + u_{0j} + r_{ij}\]

# Fitting a conditional model
fit_m1 <- lme4::lmer(score ~ 1 + week + (1 | id),
                     data      = chapman,
                     REML      = FALSE,
                     na.action = na.omit)
# summarize model
summary(fit_m1)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ 1 + week + (1 | id)
##    Data: chapman
## 
##      AIC      BIC   logLik deviance df.resid 
##   1316.1   1327.9   -654.1   1308.1      136 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.10129 -0.55873  0.08959  0.54603  2.15067 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  id       (Intercept) 877.2    29.62   
##  Residual             372.3    19.30   
## Number of obs: 140, groups:  id, 35
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  164.374      5.702   28.83
## week          26.960      1.459   18.48
## 
## Correlation of Fixed Effects:
##      (Intr)
## week -0.384

But wait, where are the p-values? Well, lme4 doesn’t compute p-values. For those desperate for a p-value, or for a test that a parameter is different from zero, we can either bootstrap them, refit the model with the “lmerTest” package, which adds p-values, or turn to the “nlme” package, which also fits multilevel models and does compute p-values.

Note: testing whether variance parameters are different from zero is not straightforward. I recommend against it, or do your reading!

# Bootstrap standard errors with the 'boot' library
b_par <- bootMer(x = fit_m1, FUN = fixef, nsim = 500)
boot.ci(b_par,type="perc", index=1)
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 500 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = b_par, type = "perc", index = 1)
## 
## Intervals : 
## Level     Percentile     
## 95%   (154.7, 175.1 )  
## Calculations and Intervals on Original Scale
boot.ci(b_par,type="perc", index=2)
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 500 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = b_par, type = "perc", index = 2)
## 
## Intervals : 
## Level     Percentile     
## 95%   (24.03, 29.91 )  
## Calculations and Intervals on Original Scale
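The object returned by ‘bootMer()’ also stores the full matrix of bootstrap replicates in its ‘t’ component, so as a sketch we can compute bootstrap standard errors directly:

```r
# Bootstrap SEs: the SD of each fixed effect across the 500 replicates
apply(b_par$t, 2, sd)
```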
# Or we could bootstrap with the 'confint' function
confint.merMod(fit_m1, parm=c(3,4), method = "boot", nsim = 500, boot.type = "perc")
## Computing bootstrap confidence intervals ...
##                 2.5 %    97.5 %
## (Intercept) 152.81492 175.12459
## week         23.81869  30.15804
# Or we could fit the model with "lmerTest::lmer()" 
# which uses the same syntax as 'lmer()'
fit_m1 <- lmerTest::lmer(score ~ 1 + week + (1 | id),
                         data      =chapman,
                         REML      = FALSE,
                         na.action = na.omit)
# summarize model
summary(fit_m1)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
##   approximations to degrees of freedom [lmerMod]
## Formula: score ~ 1 + week + (1 | id)
##    Data: chapman
## 
##      AIC      BIC   logLik deviance df.resid 
##   1316.1   1327.9   -654.1   1308.1      136 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.10129 -0.55873  0.08959  0.54603  2.15067 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  id       (Intercept) 877.2    29.62   
##  Residual             372.3    19.30   
## Number of obs: 140, groups:  id, 35
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  164.374      5.702  47.660   28.83   <2e-16 ***
## week          26.960      1.459 105.000   18.48   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## week -0.384

Let’s add a random slope and use a likelihood ratio test to see whether the random slope significantly improves model fit.

\[score_{ij} = \gamma_{00} + \gamma_{01}Week_{ij} + u_{0j} + u_{1j}Week_{ij} + r_{ij}\]

# Adding a random slope for week
fit_m2 <- lme4::lmer(score ~ 1 + week + (1 + week | id),
                     data      = chapman,
                     REML      = FALSE,
                     na.action = na.omit)
summary(fit_m2)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ 1 + week + (1 + week | id)
##    Data: chapman
## 
##      AIC      BIC   logLik deviance df.resid 
##   1287.4   1305.0   -637.7   1275.4      134 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.27251 -0.59921  0.02611  0.61085  1.59094 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  id       (Intercept) 1161.3   34.08         
##           week         127.7   11.30    -0.45
##  Residual              159.5   12.63         
## Number of obs: 140, groups:  id, 35
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  164.374      6.031   27.26
## week          26.960      2.135   12.62
## 
## Correlation of Fixed Effects:
##      (Intr)
## week -0.489
# LR test for comparing model 1 vs model 2
anova(fit_m1, fit_m2)
## Data: chapman
## Models:
## object: score ~ 1 + week + (1 | id)
## ..1: score ~ 1 + week + (1 + week | id)
##        Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)    
## object  4 1316.1 1327.9 -654.06   1308.1                             
## ..1     6 1287.4 1305.0 -637.68   1275.4 32.747      2  7.746e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Other things we can do

Plotting.

# Plotting the random intercepts and slopes
sjp.lmer(fit_m2, 
         type       = "re",
         facet.grid = FALSE,
         #sort.est  = "sort.all",
         y.offset   = .4)

sjp.lmer(fit_m2, 
         type       = "re",
         facet.grid = FALSE,
         sort.est   = "sort.all",
         y.offset   = .4)

# Pulling out the random effects (intercepts and slopes)
ranef(fit_m2)
## $id
##    (Intercept)         week
## 1    32.506129   5.95736738
## 2    51.466599   1.44854554
## 3    -7.447874  16.81722948
## 4    38.862232  -3.91288223
## 5    29.850259 -13.60432576
## 6     1.787951   2.13132503
## 7    -3.225640   4.52460394
## 8   -65.545984   8.53508424
## 9   -25.467672  -3.53111632
## 10   52.782842   0.08614679
## 11   20.524689  -5.75757227
## 12  -63.173175  10.79752028
## 13  -22.202581 -14.86216274
## 14   18.421235 -15.25997889
## 15  -16.154545  -8.48149363
## 16  -44.546921   0.89676568
## 17  -21.008001  12.09556126
## 18  -48.669731  17.77580514
## 19   10.286218  -7.24746611
## 20  -20.255235   3.24843261
## 21   15.323192  12.64574563
## 22   15.210063 -16.47989597
## 23   31.375026  -8.04663371
## 24   17.589785 -15.82655781
## 25   10.511332  -4.18921023
## 26   37.059813  -0.29968625
## 27   -8.906279  -4.18780226
## 28   25.235743   6.81278924
## 29   10.097049 -18.67303098
## 30   39.424327   3.89366726
## 31  -12.495909  -0.50142564
## 32  -32.109721  17.44093878
## 33    6.048776   8.58517478
## 34  -75.806575  12.19430363
## 35    2.652583  -5.02576590
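Relatedly, ‘coef()’ combines the fixed effects with each id’s random effects, giving the person-specific intercepts and slopes (a quick sketch):

```r
# Person-specific coefficients: fixed effects + random effects
head(coef(fit_m2)$id)
```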
# Plot random slopes and random intercepts simultaneously
sjp.lmer(fit_m2, 
         type = "rs.ri")

Or, if you are feeling saucy, maybe you want to save the predicted values and have more flexibility with your plotting. For example, we could take all of the predicted values and plot them in ggplot2. Maybe we then feed that into plotly to get more of an interactive plot. Get creative!

# Adding the predicted scores to the data
chapman$pred_m1 <- predict(fit_m1)
chapman$pred_m2 <- predict(fit_m2)

# Plotting m1 with fixed slopes
p1 <- ggplot(data = chapman, aes(x = week, y = pred_m1, 
                                 color = factor(id), group = factor(id))) +
      geom_point() +
      geom_line()
# Plotting m2 with random slopes
p2 <- ggplot(data = chapman, aes(x = week, y = pred_m2, 
                                 color = factor(id), group = factor(id))) +
      geom_point() +
      geom_line()

plotly::ggplotly(p1)
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`
plotly::ggplotly(p2)
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`

Uncorrelated random effects

fit_m2 <- lmer(score ~ 1 + week + (1 | id) + (-1 + week|id),
               data      = chapman,
               REML      = FALSE,
               na.action = na.omit)
summary(fit_m2)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
##   approximations to degrees of freedom [lmerMod]
## Formula: score ~ 1 + week + (1 | id) + (-1 + week | id)
##    Data: chapman
## 
##      AIC      BIC   logLik deviance df.resid 
##   1291.3   1306.0   -640.7   1281.3      135 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.18058 -0.59396 -0.00855  0.63622  1.54051 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  id       (Intercept) 1038.9   32.23   
##  id.1     week         111.4   10.55   
##  Residual              167.9   12.96   
## Number of obs: 140, groups:  id, 35
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  164.374      5.748  35.710   28.60  < 2e-16 ***
## week          26.960      2.035  35.710   13.25 2.44e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## week -0.123

Optimizers

In many cases you might decide you want to adjust your optimizer. See:

https://cran.r-project.org/web/packages/lme4/vignettes/lmerperf.html

https://www.rdocumentation.org/packages/lme4/versions/1.1-13/topics/lmerControl

# Explicitly setting the 'bobyqa' optimizer
fit_m2 <- lmer(score ~ 1 + week + (1 + week | id),
               data      = chapman,
               REML      = FALSE,
               na.action = na.omit,
               control   = lmerControl(optimizer = "bobyqa"))
summary(fit_m2)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
##   approximations to degrees of freedom [lmerMod]
## Formula: score ~ 1 + week + (1 + week | id)
##    Data: chapman
## Control: lmerControl(optimizer = "bobyqa")
## 
##      AIC      BIC   logLik deviance df.resid 
##   1287.4   1305.0   -637.7   1275.4      134 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.27251 -0.59921  0.02611  0.61085  1.59094 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  id       (Intercept) 1161.3   34.08         
##           week         127.7   11.30    -0.45
##  Residual              159.5   12.63         
## Number of obs: 140, groups:  id, 35
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  164.374      6.031  35.000   27.26  < 2e-16 ***
## week          26.960      2.135  35.000   12.62 1.38e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## week -0.489
# Using the 'nloptwrap' optimizer
fit_m2 <- lmer(score ~ 1 + week + (1 + week | id),
               data      = chapman,
               REML      = FALSE,
               na.action = na.omit,
               control   = lmerControl(optimizer = "nloptwrap"))
summary(fit_m2)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
##   approximations to degrees of freedom [lmerMod]
## Formula: score ~ 1 + week + (1 + week | id)
##    Data: chapman
## Control: lmerControl(optimizer = "nloptwrap")
## 
##      AIC      BIC   logLik deviance df.resid 
##   1287.4   1305.0   -637.7   1275.4      134 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.27258 -0.59923  0.02614  0.61081  1.59101 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  id       (Intercept) 1161.7   34.08         
##           week         127.7   11.30    -0.45
##  Residual              159.5   12.63         
## Number of obs: 140, groups:  id, 35
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  164.374      6.032  34.980   27.25  < 2e-16 ***
## week          26.960      2.135  35.000   12.62 1.38e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## week -0.490
# The 'optimx' package has more optimizers. Here I use the 'nlminb' optimizer
fit_m2 <- lmer(score ~ 1 + week + (1 + week | id),
               data       = chapman,
               REML       = FALSE,
               na.action  = na.omit,
               control    = lmerControl(optimizer = "optimx", calc.derivs = FALSE,
                                        optCtrl    = list(method = "nlminb", 
                                        starttests = FALSE, kkt = FALSE)))
summary(fit_m2)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
##   approximations to degrees of freedom [lmerMod]
## Formula: score ~ 1 + week + (1 + week | id)
##    Data: chapman
## Control: 
## lmerControl(optimizer = "optimx", calc.derivs = FALSE, optCtrl = list(method = "nlminb",  
##     starttests = FALSE, kkt = FALSE))
## 
##      AIC      BIC   logLik deviance df.resid 
##   1287.4   1305.0   -637.7   1275.4      134 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.27251 -0.59921  0.02611  0.61085  1.59094 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  id       (Intercept) 1161.3   34.08         
##           week         127.7   11.30    -0.45
##  Residual              159.5   12.63         
## Number of obs: 140, groups:  id, 35
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  164.374      6.031  35.000   27.26  < 2e-16 ***
## week          26.960      2.135  35.000   12.62 1.38e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## week -0.489

Autocorrelation (use ‘nlme’ instead of ‘lme4’)

While the lme4 package has some great features, it also has some disadvantages. One disadvantage is that it is not very good for adding in autocorrelation. For that we will use the ‘nlme’ package, which makes it much easier to specify an autocorrelation structure.

For more info on the possible correlation structures available in ‘nlme’, see:

https://stat.ethz.ch/R-manual/R-devel/library/nlme/html/corClasses.html

# Fitting the same random slopes model - NO autocorrelation
# Note that I have specified missing data actions and an optimizer
fit_m2_nlme <- lme(score ~ 1 + week, random = ~ 1+week|id,
                    data      = chapman,
                    na.action ="na.omit",
                    control   = lmeControl(opt='optim')) # defaults to 'nlminb'
summary(fit_m2_nlme)
## Linear mixed-effects model fit by REML
##  Data: chapman 
##        AIC      BIC    logLik
##   1278.823 1296.386 -633.4114
## 
## Random effects:
##  Formula: ~1 + week | id
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev   Corr  
## (Intercept) 34.62343 (Intr)
## week        11.50655 -0.45 
## Residual    12.62842       
## 
## Fixed effects: score ~ 1 + week 
##                Value Std.Error  DF  t-value p-value
## (Intercept) 164.3743  6.118861 104 26.86354       0
## week         26.9600  2.166604 104 12.44344       0
##  Correlation: 
##      (Intr)
## week -0.489
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -2.2720636 -0.6003582  0.0263138  0.6073054  1.5843972 
## 
## Number of Observations: 140
## Number of Groups: 35
# Adding an autoregressive process of order 1.
fit_m2AR_nlme <- lme(score ~ 1 + week, random = ~ 1+week|id,
                     data        =chapman,
                     na.action   ="na.omit",
                     control     = lmeControl(opt='optim'),
                     correlation = corAR1())
summary(fit_m2AR_nlme)
## Linear mixed-effects model fit by REML
##  Data: chapman 
##        AIC      BIC   logLik
##   1280.682 1301.173 -633.341
## 
## Random effects:
##  Formula: ~1 + week | id
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev   Corr  
## (Intercept) 34.88366 (Intr)
## week        11.65159 -0.451
## Residual    12.02219       
## 
## Correlation Structure: AR(1)
##  Formula: ~1 | id 
##  Parameter estimate(s):
##        Phi 
## -0.1084264 
## Fixed effects: score ~ 1 + week 
##                 Value Std.Error  DF  t-value p-value
## (Intercept) 164.41221  6.113990 104 26.89115       0
## week         26.91905  2.156897 104 12.48045       0
##  Correlation: 
##      (Intr)
## week -0.485
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -2.329126837 -0.619086277  0.007856578  0.645003940  1.589727248 
## 
## Number of Observations: 140
## Number of Groups: 35
# adding an autoregressive moving average process
# p and q arguments are arbitrary here. I'd think more deeply about
# how you want to specify those
fit_m2AR_nlme <- lme(score ~ 1 + week, random = ~ 1+week|id,
                     data        = chapman,
                     na.action   ="na.omit",
                     control     = lmeControl(opt='optim'),
                     correlation = corARMA(p=1,q=1))
summary(fit_m2AR_nlme)
## Linear mixed-effects model fit by REML
##  Data: chapman 
##        AIC      BIC    logLik
##   1282.523 1305.941 -633.2617
## 
## Random effects:
##  Formula: ~1 + week | id
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev   Corr  
## (Intercept) 34.60435 (Intr)
## week        11.48149 -0.446
## Residual    12.44719       
## 
## Correlation Structure: ARMA(1,1)
##  Formula: ~1 | id 
##  Parameter estimate(s):
##       Phi1     Theta1 
## -0.9525161  0.9126944 
## Fixed effects: score ~ 1 + week 
##                 Value Std.Error  DF  t-value p-value
## (Intercept) 164.45797  6.099994 104 26.96035       0
## week         26.90339  2.152586 104 12.49817       0
##  Correlation: 
##      (Intr)
## week -0.484
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -2.25965587 -0.60044419  0.03105887  0.62690880  1.58419500 
## 
## Number of Observations: 140
## Number of Groups: 35
#Does the ARMA autocorrelation improve model fit?
anova(fit_m2_nlme, fit_m2AR_nlme)
##               Model df      AIC      BIC    logLik   Test   L.Ratio
## fit_m2_nlme       1  6 1278.823 1296.386 -633.4114                 
## fit_m2AR_nlme     2  8 1282.523 1305.941 -633.2617 1 vs 2 0.2993853
##               p-value
## fit_m2_nlme          
## fit_m2AR_nlme   0.861
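Rather than guessing at p and q, we can inspect the within-person residual autocorrelation directly; ‘nlme’ provides an ‘ACF()’ method for ‘lme’ fits (a sketch):

```r
# Empirical autocorrelation of the within-person residuals
ACF(fit_m2_nlme, maxLag = 3)
# plot(ACF(fit_m2_nlme, maxLag = 3), alpha = 0.05)  # with approximate 95% bands
```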

Fitting models in a Bayesian framework

For the sake of time I am going to skip to the random slopes model and show how we might fit it in a Bayesian framework. Doing this in R is actually rather simple. So simple, in fact, that we could use the exact same lme4 syntax with the ‘brm()’ function instead of ‘lmer()’. Just like this.

# Fitting the brms model
fit_m0b <- brm(score ~ 1 + week + (1 + week | id), 
               data = chapman)

But this may be a bit naive, because the ‘brm()’ function has many built-in defaults (priors, burn-in/warmup iterations, sampling iterations, chains, etc.). Below is the exact same model, but with some of the important defaults made explicit.

# Fitting the brms model
fit_m2b <- brm(score ~ 1 + week + (1 + week | id), 
               data    = chapman,
               prior   = NULL,
               autocor = NULL,
               chains  = 4,
               iter    = 2000,
               warmup  = 1000)
## Compiling the C++ model
## Start sampling
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
## 
##  Elapsed Time: 3.188 seconds (Warm-up)
##                1.15 seconds (Sampling)
##                4.338 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
## 
##  Elapsed Time: 3.118 seconds (Warm-up)
##                0.969 seconds (Sampling)
##                4.087 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
## 
##  Elapsed Time: 2.856 seconds (Warm-up)
##                1.022 seconds (Sampling)
##                3.878 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 4).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
## 
##  Elapsed Time: 3.433 seconds (Warm-up)
##                1.038 seconds (Sampling)
##                4.471 seconds (Total)
summary(fit_m2b)
##  Family: gaussian(identity) 
## Formula: score ~ 1 + week + (1 + week | id) 
##    Data: chapman (Number of observations: 140) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; 
##          total post-warmup samples = 4000
##     ICs: LOO = Not computed; WAIC = Not computed
##  
## Group-Level Effects: 
## ~id (Number of levels: 35) 
##                     Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)          35.37      4.79    27.29    46.20       1364    1
## sd(week)               11.85      1.89     8.51    15.78       1237    1
## cor(Intercept,week)    -0.40      0.16    -0.68    -0.06       1594    1
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept   164.32      6.30   152.12   176.93        614    1
## week         26.93      2.26    22.54    31.36       1355    1
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma    12.95      1.13    10.95    15.34       1787    1
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

Plotting the posterior distributions and the effects is especially useful here.

# plotting paramters
plot(fit_m2b)

stanplot(fit_m2b)

That correlation looks different from zero in the posterior distributions but equal to zero in the effects plot? It turns out that is just because we are mixing parameters with very different scales in one plot. Let’s plot just the correlation effect.

stanplot(fit_m2b, pars = "cor_id")

Priors

I’m mostly taking a pass on priors, because I’m not particularly savvy with Bayesian methods.
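Even so, it is worth knowing what you are getting by default. The ‘get_prior()’ function in ‘brms’ lists the default priors for a given formula and dataset (a quick sketch):

```r
# List the priors brm() would use by default for this model
get_prior(score ~ 1 + week + (1 + week | id), data = chapman)
```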

# Fitting the brms model
fit_m2b_priors <- brm(score ~ 1 + week + (1 + week | id),
                      data    = chapman,
                      prior   = set_prior(prior = "normal(0,10)",
                                          class = "b"),
                      autocor = NULL,
                      chains  = 4,
                      iter    = 2000,
                      warmup  = 1000
                      )
## Compiling the C++ model
## Start sampling
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
## 
##  Elapsed Time: 3.026 seconds (Warm-up)
##                1.258 seconds (Sampling)
##                4.284 seconds (Total)
## 
## 
## (similar sampling output for chains 2-4 omitted)
summary(fit_m2b_priors)
##  Family: gaussian(identity) 
## Formula: score ~ 1 + week + (1 + week | id) 
##    Data: chapman (Number of observations: 140) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; 
##          total post-warmup samples = 4000
##     ICs: LOO = Not computed; WAIC = Not computed
##  
## Group-Level Effects: 
## ~id (Number of levels: 35) 
##                     Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)          35.62      4.82    27.65    46.53       1267    1
## sd(week)               11.99      1.97     8.61    16.37       1091    1
## cor(Intercept,week)    -0.41      0.16    -0.68    -0.05       1131    1
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept   166.17      6.38   153.75   178.58        675    1
## week         25.64      2.22    21.21    29.77       1210    1
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma    12.96      1.15    10.93    15.43       1750    1
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).
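From the Group-Level Effects above we can recover the implied covariance between the random intercepts and random slopes: it is the correlation times the two standard deviations. A quick sketch in base R, using values copied from the printed summary (this is plain arithmetic, not a call into the fitted object):

```r
# Posterior point estimates copied from the summary output above
sd_int  <- 35.62   # sd(Intercept)
sd_week <- 11.99   # sd(week)
cor_iw  <- -0.41   # cor(Intercept,week)

# Implied covariance between random intercepts and random slopes
cov_iw <- cor_iw * sd_int * sd_week
round(cov_iw, 1)   # about -175: higher starting scores go with flatter growth
```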
# Plotting the parameters
plot(fit_m2b_priors)

stanplot(fit_m2b_priors)

Autocorrelation

With longitudinal data the within-person residuals may be serially correlated. In ‘brms’ we can model this by adding an AR(1) process for the residuals via the ‘autocor’ argument.

# Fitting the brms model
fit_m2b_AR <- brm(score ~ 1 + week + (1 | id),
                  data    = chapman,
                  prior   = set_prior(prior = "normal(0,10)",
                                      class = "b"),
                  autocor = cor_ar(~week|id, p=1), # Adding an AR 1 process here
                  chains  = 4,
                  iter    = 2000,
                  warmup  = 1000
                  )
## Compiling the C++ model
## Start sampling
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
## 
##  Elapsed Time: 2.784 seconds (Warm-up)
##                1.349 seconds (Sampling)
##                4.133 seconds (Total)
## 
## 
## (similar sampling output for chains 2-4 omitted)

2. Binary Outcomes

# Reading in the data
thai <- readRDS('binary_thai.rds')
# Checking the dataset
dim(thai)
## [1] 8582    8
schoolid male pped rep1 l_enrc factotc mtequalc msesc
180210 1 1 0 0.95 0.79 0.27 0.47
180210 0 1 0 0.95 0.79 0.27 0.47
180210 0 0 0 0.95 0.79 0.27 0.47
180210 0 1 0 0.95 0.79 0.27 0.47
180210 1 1 0 0.95 0.79 0.27 0.47
180210 1 0 0 0.95 0.79 0.27 0.47
180210 1 1 0 0.95 0.79 0.27 0.47
180210 0 1 0 0.95 0.79 0.27 0.47
180210 0 1 0 0.95 0.79 0.27 0.47
180210 0 1 0 0.95 0.79 0.27 0.47

Some of the variables we are going to use are the binary outcome ‘pped’, the student-level predictor ‘male’, the school-level predictor ‘msesc’, and the grouping variable ‘schoolid’.

# Fitting a null model
fit_m0 <- glmer(pped ~ 1 + (1 | schoolid),
                family  = "binomial",
                data    = thai,
                control = glmerControl(optimizer = "bobyqa"))
summary(fit_m0)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: pped ~ 1 + (1 | schoolid)
##    Data: thai
## Control: glmerControl(optimizer = "bobyqa")
## 
##      AIC      BIC   logLik deviance df.resid 
##   7215.8   7230.0  -3605.9   7211.8     8580 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -5.3048 -0.2980  0.1255  0.4049  4.7030 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  schoolid (Intercept) 9.13     3.022   
## Number of obs: 8582, groups:  schoolid, 411
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  -0.3784     0.1567  -2.415   0.0157 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
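From the null model we can compute an intraclass correlation. For a logistic multilevel model the level-1 residual variance on the logit scale is fixed at pi^2/3 (about 3.29), so the ICC is just arithmetic on the printed school-level variance:

```r
# School-level variance copied from the null model summary above
var_school <- 9.13
# Level-1 variance of the standard logistic distribution
var_resid  <- pi^2 / 3

icc <- var_school / (var_school + var_resid)
round(icc, 3)   # about 0.735: most of the variation is between schools
```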
# Adding predictors and random slopes
fit_m1 <- glmer(pped ~ 1 + male + msesc + (1 + male | schoolid),
                family  = "binomial",
                data    = thai,
                control = glmerControl(optimizer = "bobyqa"))
summary(fit_m1)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: pped ~ 1 + male + msesc + (1 + male | schoolid)
##    Data: thai
## Control: glmerControl(optimizer = "bobyqa")
## 
##      AIC      BIC   logLik deviance df.resid 
##   6838.0   6880.1  -3413.0   6826.0     8335 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -6.8928 -0.2904 -0.1076  0.4046  4.8674 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  schoolid (Intercept) 6.4034   2.5305        
##           male        0.1993   0.4465   -0.18
## Number of obs: 8341, groups:  schoolid, 399
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -0.36423    0.14508  -2.510   0.0121 *  
## male        -0.14345    0.07706  -1.862   0.0627 .  
## msesc        3.92377    0.32744  11.983   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##       (Intr) male  
## male  -0.291       
## msesc -0.037 -0.028
# Putting confidence intervals around the fixed effects
confint(fit_m1, parm = "beta_", method="Wald") 
##                  2.5 %       97.5 %
## (Intercept) -0.6485839 -0.079872048
## male        -0.2944821  0.007576801
## msesc        3.2819954  4.565539692
# Odds ratios instead of log-odds
exp(fixef(fit_m1))
## (Intercept)        male       msesc 
##   0.6947328   0.8663618  50.5906880
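Beyond odds ratios, we can convert predicted log-odds to probabilities with the inverse logit, ‘plogis()’ in base R. A sketch using the fixed-effect point estimates copied from the summary above (random effects are set to zero here, so these are probabilities for a "typical" school):

```r
b0     <- -0.36423  # intercept (log-odds)
b_male <- -0.14345  # male coefficient (log-odds)

exp(b_male)          # odds ratio for male, matching the output above
plogis(b0)           # predicted probability for a girl at msesc = 0
plogis(b0 + b_male)  # predicted probability for a boy at msesc = 0
```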

Optimizers:

Optimizers in R are somewhat limited from what I have seen. For example, the default number of adaptive quadrature points in ‘glmer()’ is 1, which is equivalent to the Laplace approximation. We can increase the number of quadrature points, but only when the model has a single scalar random effect (i.e., a random intercept only). As soon as we add a random slope, ‘nAGQ’ cannot be greater than 1.

fit <- glmer(pped ~ 1 + male + msesc + (1 | schoolid),
             family  = "binomial",
             data    = thai,
             nAGQ    = 5,
             control = glmerControl(optimizer = "bobyqa"))

But as soon as we add a random slope this model will not run.

fit <- glmer(pped ~ 1 + male + msesc + (1 + male | schoolid),
             family  = "binomial",
             data    = thai,
             nAGQ    = 5,
             control = glmerControl(optimizer = "bobyqa"))

As we did before, we can change the optimizer to ‘nloptwrap’. Note that the control object must be passed through the ‘control’ argument and built with ‘glmerControl()’ (not ‘lmerControl()’); otherwise glmer will disregard it with a warning.

fit <- glmer(pped ~ 1 + male + msesc + (1 + male | schoolid),
             family  = "binomial",
             data    = thai,
             nAGQ    = 1,
             control = glmerControl(optimizer = "nloptwrap"))

Binary with ‘brms’

Because ‘pped’ only takes two values, brms will suggest that family = bernoulli() would be a more efficient choice (see the message in the output below); family = "binomial" also works.

fit <- brm(pped ~ 1 + male + msesc + (1|schoolid),
           family = "binomial", 
           data   = thai,
           chains = 2)
## Using the maximum of the response variable as the number of trials.
## Only 2 levels detected so that family 'bernoulli' might be a more efficient choice.
## Compiling the C++ model
## Start sampling
## 
## SAMPLING FOR MODEL 'binomial(logit) brms-model' NOW (CHAIN 1).
## 
## Gradient evaluation took 0.006 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 60 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
## 
##  Elapsed Time: 132.65 seconds (Warm-up)
##                157.117 seconds (Sampling)
##                289.767 seconds (Total)
## 
## 
## (similar sampling output for chain 2 omitted)
summary(fit)
##  Family: binomial(logit) 
## Formula: pped ~ 1 + male + msesc + (1 | schoolid) 
##    Data: thai (Number of observations: 8341) 
## Samples: 2 chains, each with iter = 2000; warmup = 1000; thin = 1; 
##          total post-warmup samples = 2000
##     ICs: LOO = Not computed; WAIC = Not computed
##  
## Group-Level Effects: 
## ~schoolid (Number of levels: 399) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)     2.56      0.14     2.29     2.84        390    1
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept    -0.35      0.14    -0.63    -0.08        289 1.00
## male         -0.15      0.07    -0.28    -0.02       2000 1.00
## msesc         3.90      0.34     3.26     4.57        187 1.01
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit)

stanplot(fit)

3. Cross-Classified

data("Penicillin")
dim(Penicillin)
## [1] 144   3
head(Penicillin)
##   diameter plate sample
## 1       27     a      A
## 2       23     a      B
## 3       26     a      C
## 4       23     a      D
## 5       23     a      E
## 6       21     a      F

Description of the data

The data are described in Davies and Goldsmith (1972) as coming from an investigation to “assess the variability between samples of penicillin by the B. subtilis method. In this test method a bulk-inoculated nutrient agar medium is poured into a Petri dish of approximately 90 mm. diameter, known as a plate. When the medium has set, six small hollow cylinders or pots (about 4 mm. in diameter) are cemented onto the surface at equally spaced intervals. A few drops of the penicillin solutions to be compared are placed in the respective cylinders, and the whole plate is placed in an incubator for a given time. Penicillin diffuses from the pots into the agar, and this produces a clear circular zone of inhibition of growth of the organisms, which can be readily measured. The diameter of the zone is related in a known way to the concentration of penicillin in the solution.”

p3 <- ggplot(data = Penicillin, aes(y=diameter, x=plate, color = sample, group = sample)) +
       geom_point()+
       geom_line()

plotly::ggplotly(p3)
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`

Frequentist

fit <- lme4::lmer(diameter ~ 1 + (1|plate) + (1|sample),
                  data      = Penicillin,
                  REML      = FALSE,
                  na.action = na.omit,
                  control   = lmerControl(optimizer = "bobyqa"))
summary(fit)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: diameter ~ 1 + (1 | plate) + (1 | sample)
##    Data: Penicillin
## Control: lmerControl(optimizer = "bobyqa")
## 
##      AIC      BIC   logLik deviance df.resid 
##    340.2    352.1   -166.1    332.2      140 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.08031 -0.67227  0.06412  0.58356  2.97933 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  plate    (Intercept) 0.7150   0.8456  
##  sample   (Intercept) 3.1352   1.7706  
##  Residual             0.3024   0.5499  
## Number of obs: 144, groups:  plate, 24; sample, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  22.9722     0.7446   30.85
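With two crossed random effects we can partition the total variance into plate, sample, and residual components. Using the variance estimates printed above (plain arithmetic on the printed values, not calls into the fitted model):

```r
# Variance components copied from the lmer summary above
var_plate  <- 0.7150
var_sample <- 3.1352
var_resid  <- 0.3024
total <- var_plate + var_sample + var_resid

# Variance partition coefficients
round(c(plate  = var_plate  / total,
        sample = var_sample / total,
        resid  = var_resid  / total), 3)
# roughly: plate 0.172, sample 0.755, resid 0.073
```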

Bayesian

fit <- brm(diameter ~ 1 + (1|plate) + (1|sample),
            data   = Penicillin,
            warmup = 5000,
            iter   = 10000)
## Compiling the C++ model
## Start sampling
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 10000 [  0%]  (Warmup)
## Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Iteration: 10000 / 10000 [100%]  (Sampling)
## 
##  Elapsed Time: 6.873 seconds (Warm-up)
##                9.935 seconds (Sampling)
##                16.808 seconds (Total)
## 
## 
## (similar sampling output for chains 2-4 omitted)
## Warning: There were 70 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Warning: Examine the pairs() plot to diagnose sampling problems
summary(fit)
## Warning: There were 70 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help.
## See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
##  Family: gaussian(identity) 
## Formula: diameter ~ 1 + (1 | plate) + (1 | sample) 
##    Data: Penicillin (Number of observations: 144) 
## Samples: 4 chains, each with iter = 10000; warmup = 5000; thin = 1; 
##          total post-warmup samples = 20000
##     ICs: LOO = Not computed; WAIC = Not computed
##  
## Group-Level Effects: 
## ~plate (Number of levels: 24) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)     0.89      0.15     0.65     1.23       2973    1
## 
## ~sample (Number of levels: 6) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)     2.61      1.11      1.3     5.65       3215    1
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept    22.99      1.21    20.55    25.49       2287    1
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma     0.56      0.04     0.49     0.64      11352    1
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit)

stanplot(fit)

From brms: whenever you see the warning “There were x divergent transitions after warmup” you should really think about increasing adapt_delta. To do this, write control = list(adapt_delta = <x>), where <x> should usually be a value between 0.8 (the current default) and 1. Increasing adapt_delta will slow down the sampler but will decrease the number of divergent transitions threatening the validity of your posterior samples.

fit <- brm(diameter ~ 1 + (1|plate) + (1|sample),
            data    = Penicillin,
            warmup  = 5000,
            iter    = 10000,
            control = list(adapt_delta = .99))
## Compiling the C++ model
## Start sampling
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 10000 [  0%]  (Warmup)
## Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Iteration: 10000 / 10000 [100%]  (Sampling)
## 
##  Elapsed Time: 22.204 seconds (Warm-up)
##                22.347 seconds (Sampling)
##                44.551 seconds (Total)
## 
## 
## (similar sampling output for chains 2-4 omitted)
## Warning: There were 1 divergent transitions after warmup. Increasing adapt_delta above 0.99 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Warning: There were 43 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
## http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
## Warning: Examine the pairs() plot to diagnose sampling problems
summary(fit)
## Warning: There were 1 divergent transitions after warmup. Increasing adapt_delta above 0.99 may help.
## See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
##  Family: gaussian(identity) 
## Formula: diameter ~ 1 + (1 | plate) + (1 | sample) 
##    Data: Penicillin (Number of observations: 144) 
## Samples: 4 chains, each with iter = 10000; warmup = 5000; thin = 1; 
##          total post-warmup samples = 20000
##     ICs: LOO = Not computed; WAIC = Not computed
##  
## Group-Level Effects: 
## ~plate (Number of levels: 24) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)     0.89      0.15     0.65     1.23       2901    1
## 
## ~sample (Number of levels: 6) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)     2.63       1.2     1.28     5.73       3768    1
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept    22.97      1.18    20.66    25.31       3034    1
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma     0.56      0.04     0.49     0.64      12129    1
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit)

stanplot(fit)

But even with adapt_delta = .99 I couldn’t get this one to run completely clean :/ There was still one divergent transition, plus some transitions that exceeded the maximum treedepth.

Another example

# Reading in the data (from the working directory set earlier)
cross <- read.csv("crossclassified.csv")
dim(cross)
## [1] 11200    13
kable(head(cross))
id year policy state citi_ideo time legprofess popdens diffus ideodist ideo_zero outcome dem
1 1994 1 alabama 39.415 1 0.158 83.2 0 13.3256 9.534531 0 1
1 1994 13 alabama 39.415 1 0.158 83.2 0 0.0000 9.534531 0 1
1 1994 4 alabama 39.415 1 0.158 83.2 0 0.0000 9.534531 0 1
1 1994 7 alabama 39.415 1 0.158 83.2 0 0.0000 9.534531 0 1
1 1994 10 alabama 39.415 1 0.158 83.2 0 0.0000 9.534531 0 1
1 1994 9 alabama 39.415 1 0.158 83.2 0 0.0000 9.534531 0 1
fit <- lme4::glmer(outcome ~ 1 + (1|state) + (1|policy),
                   family = "binomial",
                   data   = cross)
summary(fit)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: outcome ~ 1 + (1 | state) + (1 | policy)
##    Data: cross
## 
##      AIC      BIC   logLik deviance df.resid 
##   2856.2   2877.7  -1425.1   2850.2     9370 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.4944 -0.2191 -0.1678 -0.1274 10.1235 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  state  (Intercept) 0.4495   0.6705  
##  policy (Intercept) 0.3580   0.5983  
## Number of obs: 9373, groups:  state, 50; policy, 14
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -3.4432     0.1969  -17.49   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
fit <- lme4::glmer(outcome ~ 1 + time + (1|state) + (1|policy),
                   family = "binomial",
                   data   = cross)
summary(fit)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: outcome ~ 1 + time + (1 | state) + (1 | policy)
##    Data: cross
## 
##      AIC      BIC   logLik deviance df.resid 
##   2575.4   2604.0  -1283.7   2567.4     9369 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.9184 -0.2055 -0.1238 -0.0714 30.2225 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  state  (Intercept) 1.033    1.0165  
##  policy (Intercept) 0.738    0.8591  
## Number of obs: 9373, groups:  state, 50; policy, 14
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -5.82404    0.33103  -17.59   <2e-16 ***
## time         0.24133    0.01536   15.71   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## time -0.537
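As in the earlier binary example, exponentiating a coefficient turns a log-odds effect into an odds ratio. Using the ‘time’ estimate printed above (arithmetic on the printed value):

```r
# Fixed-effect estimate for time, copied from the summary above
b_time <- 0.24133
exp(b_time)   # about 1.27: each additional time unit multiplies the odds of the outcome by ~1.27
```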
fit <- lme4::glmer(outcome ~ 1 + citi_ideo + legprofess + popdens +
                   diffus + ideodist + ideo_zero + dem + time +
                   (1|state) + (1|policy),
                   family = "binomial",
                   data   = cross)

Briefly on other count models

I don’t have data to demonstrate these, but I thought I’d share the general syntax if you were to fit some count models. Notice that for Poisson we only change the family argument.

# poisson
fit <- lme4::glmer(y ~ 1 + x1 + x2 + (1 | groupvar),
                   family = "poisson",
                   data = mydata)

Negative binomial is also possible via ‘glmer.nb()’, but you may be better off in specialized software such as Stata.

Note from the lme4 package: “Parts of glmer.nb() are still experimental and methods are still missing or suboptimal. In particular, there is no inference available for the dispersion parameter θ, yet.”

# Negative binomial
fit <- lme4::glmer.nb(y ~ 1 + x1 + x2 + (1 | groupvar),
                      data = mydata)

The ‘brms’ package also supports a large number of distributions and link functions for the response variable.

See the section on ‘brmsfamily’ in the package manual: https://cran.r-project.org/web/packages/brms/brms.pdf
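For example, a logistic multilevel model in ‘brms’ looks very similar to the lme4 syntax above. This is just a sketch, reusing the hypothetical y, x1, x2, groupvar, and mydata from the count examples:

# Logistic (Bernoulli) multilevel model in brms
fit <- brms::brm(y ~ 1 + x1 + x2 + (1 | groupvar),
                 family = bernoulli(link = "logit"),
                 data   = mydata)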

4. Multiple Membership

# Reading in the data
nurses <- readRDS("mm_nursing.rds")

Describing the variables and data

Checking the dimensions of the dataset, we have 1000 cases and 66 variables.

# Checking the dataset
dim(nurses)
## [1] 1000   66

I am going to subset only the variables of interest:

nurses <- nurses[, c("patient", "satis", "assess", 
                     "n1st", "n2nd", "n3rd", "n4th",
                     "p1st", "p2nd", "p3rd", "p4th")]

We can print some of the raw data to see what it looks like.

# Set a seed and randomly sample from rows 1-1000.
# The random sample here isn't important, only useful for displaying data.
set.seed(872017)
randomRows <- sample(1:1000,10)
# Print the randomly sampled rows
knitr::kable(head(nurses[randomRows, ], n = 10))
patient satis assess n1st n2nd n3rd n4th p1st p2nd p3rd p4th
529 -0.0569810 -0.3350042 25 11 0 0 0.33 0.67 0.00 0
126 0.1723346 -1.1262732 17 0 0 0 1.00 0.00 0.00 0
319 1.4107995 -0.8070087 11 0 0 0 1.00 0.00 0.00 0
422 1.0342109 1.4629703 3 17 0 0 0.74 0.26 0.00 0
432 -1.2950493 -1.4369467 8 20 0 0 0.51 0.49 0.00 0
336 0.0504741 -0.4920052 5 0 0 0 1.00 0.00 0.00 0
370 -0.0277822 -1.1063883 21 0 0 0 1.00 0.00 0.00 0
560 0.1216989 -1.3557973 12 23 0 0 0.61 0.39 0.00 0
711 -0.8439955 -1.4817982 24 2 14 0 0.41 0.46 0.13 0
867 -1.4477801 -1.5170703 15 8 12 0 0.48 0.35 0.17 0

Variables

  • patient: an ID for patients
  • satis: a satisfaction score
  • assess: an assessment score
  • n1st-n4th: IDs for the 1st through 4th nurses who tended the patient
  • p1st-p4th: the proportion of time spent with each of those nurses

So the general idea for this model is that each patient is nested within multiple nurses at the same time, which is exactly a multiple membership model. We weight the effect of each nurse by the proportion of time the patient spent with that nurse.
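Because the weights are proportions of time, each patient’s four weights should sum to 1. A quick sanity check, sketched with the variable names above:

# The multiple membership weights should sum to 1 for every patient
w <- nurses[, c("p1st", "p2nd", "p3rd", "p4th")]
summary(rowSums(w))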

Fitting the multiple membership model with ‘brms’

Unfortunately, our usual multilevel modeling packages, ‘lme4’ and ‘nlme’, cannot fit multiple membership models. So we turn to the Bayesian package ‘brms’, which makes fitting these in R possible. I am using the default priors and number of chains, and setting the warm-up (burn-in) and total iterations myself. If you would like to change these, or do not know what they are, I suggest reading the ‘brms’ documentation.

# model syntax
# This will spit out lots of information about the model fitting process
fit_mm <- brm(satis ~ 1 + assess + (1 | mm(n1st, n2nd, n3rd, n4th, 
                                           weights = cbind(p1st, p2nd, p3rd, p4th))),
              data   = nurses,
              warmup = 5000,
              iter   = 10000)
## Compiling the C++ model
## Start sampling
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 10000 [  0%]  (Warmup)
## Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Iteration: 10000 / 10000 [100%]  (Sampling)
## 
##  Elapsed Time: 20.073 seconds (Warm-up)
##                18.607 seconds (Sampling)
##                38.68 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 10000 [  0%]  (Warmup)
## Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Iteration: 10000 / 10000 [100%]  (Sampling)
## 
##  Elapsed Time: 18.555 seconds (Warm-up)
##                17.118 seconds (Sampling)
##                35.673 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 10000 [  0%]  (Warmup)
## Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Iteration: 10000 / 10000 [100%]  (Sampling)
## 
##  Elapsed Time: 18.2 seconds (Warm-up)
##                17.215 seconds (Sampling)
##                35.415 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 4).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 10000 [  0%]  (Warmup)
## Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Iteration: 10000 / 10000 [100%]  (Sampling)
## 
##  Elapsed Time: 19.208 seconds (Warm-up)
##                16.641 seconds (Sampling)
##                35.849 seconds (Total)
# Printing the model parameters
summary(fit_mm)
##  Family: gaussian(identity) 
## Formula: satis ~ 1 + assess + (1 | mm(n1st, n2nd, n3rd, n4th, weights = cbind(p1st, p2nd, p3rd, p4th))) 
##    Data: nurses (Number of observations: 1000) 
## Samples: 4 chains, each with iter = 10000; warmup = 5000; thin = 1; 
##          total post-warmup samples = 20000
##     ICs: LOO = Not computed; WAIC = Not computed
##  
## Group-Level Effects: 
## ~mmn1stn2ndn3rdn4th (Number of levels: 26) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)     0.55      0.09      0.4     0.76       3210    1
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept    -0.03      0.11    -0.25     0.19       2023    1
## assess        0.49      0.03     0.44     0.54      20000    1
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma     0.77      0.02     0.73      0.8      20000    1
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

Plotting the results

# Plotting results
plot(fit_mm)

# Posterior interval plot of the model parameters
stanplot(fit_mm)
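If you want the estimated nurse-level intercepts themselves, ‘brms’ also provides a ranef() method for fitted models; the grouping term carries the ‘mmn1stn2ndn3rdn4th’ label shown in the summary above.

# Posterior summaries of the nurse-level random intercepts
ranef(fit_mm)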